    Use of Wikipedia Categories in Entity Ranking

    Wikipedia is a useful source of knowledge that has many applications in language processing and knowledge representation. The Wikipedia category graph can be compared with the class hierarchy in an ontology; it has some characteristics in common as well as some differences. In this paper, we present our approach for answering entity ranking queries from Wikipedia. In particular, we explore how to make use of Wikipedia categories to improve entity ranking effectiveness. Our experiments show that using categories of example entities works significantly better than using loosely defined target categories.

    Enhancing Content-And-Structure Information Retrieval using a Native XML Database

    Three approaches to content-and-structure XML retrieval are analysed in this paper: first by using Zettair, a full-text information retrieval system; second by using eXist, a native XML database; and third by using a hybrid XML retrieval system that uses eXist to produce the final answers from likely relevant articles retrieved by Zettair. INEX 2003 content-and-structure topics can be classified into two categories: the first retrieving full articles as final answers, and the second retrieving more specific elements within articles as final answers. We show that for both topic categories our initial hybrid system improves the retrieval effectiveness of a native XML database. For ranking the final answer elements, we propose and evaluate a novel retrieval model that utilises the structural relationships between the answer elements of a native XML database and retrieves Coherent Retrieval Elements. The final results of our experiments show that, when the XML retrieval task focusses on highly relevant elements, our hybrid XML retrieval system with the Coherent Retrieval Elements module is 1.8 times more effective than Zettair and 3 times more effective than eXist, and yields effective content-and-structure XML retrieval.

    Users and Assessors in the Context of INEX: Are Relevance Dimensions Relevant?

    The main aspects of XML retrieval are identified by analysing and comparing the following two behaviours: the behaviour of the assessor when judging the relevance of returned document components; and the behaviour of users when interacting with components of XML documents. We argue that the two INEX relevance dimensions, Exhaustivity and Specificity, are not orthogonal dimensions; indeed, an empirical analysis of each dimension reveals that the grades of the two dimensions are correlated with each other. By analysing the level of agreement between the assessor and the users, we aim at identifying the best units of retrieval. The results of our analysis show that the highest level of agreement is on highly relevant and on non-relevant document components, suggesting that only the end points of the INEX 10-point relevance scale are perceived in the same way by both the assessor and the users. We propose a new definition of relevance for XML retrieval and argue that its corresponding relevance scale would be a better choice for INEX.

    Hybrid XML Retrieval: Combining Information Retrieval and a Native XML Database

    This paper investigates the impact of three approaches to XML retrieval: using Zettair, a full-text information retrieval system; using eXist, a native XML database; and using a hybrid system that takes full article answers from Zettair and uses eXist to extract elements from those articles. For the content-only topics, we undertake a preliminary analysis of the INEX 2003 relevance assessments in order to identify the types of highly relevant document components. Further analysis identifies two complementary sub-cases of relevance assessments ("General" and "Specific") and two categories of topics ("Broad" and "Narrow"). We develop a novel retrieval module that, for a content-only topic, utilises the information from the resulting answer list of a native XML database and dynamically determines the preferable units of retrieval, which we call "Coherent Retrieval Elements". The results of our experiments show that, when each of the three systems is evaluated against different retrieval scenarios (such as different cases of relevance assessments, different topic categories and different choices of evaluation metrics), the XML retrieval systems exhibit varying behaviour and the best performance can be reached for different values of the retrieval parameters. In the case of INEX 2003 relevance assessments for the content-only topics, our newly developed hybrid XML retrieval system is substantially more effective than either Zettair or eXist, and yields robust and very effective XML retrieval. Comment: Postprint version. The editor version can be accessed through the DO

    Enhancing IT Architect capabilities: Experiences within a university subject

    The role of IT Architect is important in the development and successful implementation of Information Technology systems across the world. The people performing the role are critical to the success of the systems. This paper reports on the results of an experiment aimed at developing two key IT Architect capabilities within the context of a postgraduate Systems Architecture subject. One capability is related to problem solving; surprisingly, while student problem-solving confidence was impacted, other aspects of problem solving important for IT Architects were unchanged. The other capability being researched, future time orientation, was also unchanged by the intervention. Therefore, alternative approaches for improving these capabilities are preferable, as factors such as external pressures on the students within the semester outweighed any short-term capability improvement.

    Entity Ranking in Wikipedia

    The traditional entity extraction problem lies in extracting named entities from plain text using natural language processing techniques and intensive training on large document collections. Examples of named entities include organisations, people, locations, or dates. There are many research activities involving named entities; we are interested in entity ranking in the field of information retrieval. In this paper, we describe our approach to identifying and ranking entities from the INEX Wikipedia document collection. We first introduce a number of interesting features that Wikipedia offers for entity identification and ranking. We then describe the principles and the architecture of our entity ranking system, and introduce our methodology for evaluation. Our preliminary results show that the use of categories and the link structure of Wikipedia, together with entity examples, can significantly improve retrieval effectiveness. Comment: to appea

    Open-source development experiences in scientific software: the HANDE quantum Monte Carlo project

    The HANDE quantum Monte Carlo project offers accessible stochastic algorithms for general use for scientists in the field of quantum chemistry. HANDE is an ambitious and general high-performance code developed by a geographically-dispersed team with a variety of backgrounds in computational science. In the course of preparing a public, open-source release, we have taken this opportunity to step back and look at what we have done and what we hope to do in the future. We pay particular attention to development processes, the approach taken to train students joining the project, and how a flat hierarchical structure aids communication. Comment: 6 pages. Submission to WSSSPE

    Evaluating Focused Retrieval Tasks

    Focused retrieval, identified by question answering, passage retrieval, and XML element retrieval, is becoming increasingly important within the broad task of information retrieval. In this paper, we present a taxonomy of text retrieval tasks based on the structure of the answers required by a task. Of particular importance are the in context tasks of focused retrieval, where not only relevant documents should be retrieved but also relevant information within each document should be correctly identified. Answers containing relevant information could be, for example, best entry points, or non-overlapping passages or elements. Our main research question is: How should the effectiveness of focused retrieval be evaluated? We propose an evaluation framework where different aspects of the in context focused retrieval tasks can be consistently evaluated and compared, and use fidelity tests on simulated runs to show what is measured. Results from our fidelity experiments demonstrate the usefulness of the proposed evaluation framework, and show its ability to measure different aspects and model different evaluation assumptions of focused retrieval.

    Fluid structure interaction of patient specific abdominal aortic aneurysms: a comparison with solid stress models

    BACKGROUND: Abdominal aortic aneurysm (AAA) is a dilatation of the aortic wall, which can rupture if left untreated. Previous work has shown that maximum diameter is not a reliable determinant of AAA rupture; however, it is currently the most widely accepted indicator. Wall stress may be a better indicator, and promising patient-specific results from structural models using static pressure have been published. Since flow and pressure inside an AAA are non-uniform, the dynamic interaction between the pulsatile flow and the wall may influence the predicted wall stress. The purpose of the present study was to compare static and dynamic wall stress analysis of patient-specific AAAs. METHOD: Patient-specific AAA models were created from CT scans of three patients. Two simulations were performed on each lumen model: a fluid structure interaction (FSI) model and a static structural (SS) model. The AAA wall was created by dilating the lumen with a uniform 1.5 mm thickness, and was modeled as a non-linear hyperelastic material. The commercial finite element code Adina 8.2 was used for all simulations. The results were compared between the FSI and SS simulations. RESULTS: Results are presented for the wall stress patterns, wall shear stress patterns, pressure, and velocity fields within the lumen. It is demonstrated that including fluid flow can change local wall stresses slightly. However, as far as the peak wall stress is concerned, this effect is negligible, as the difference between SS and FSI models is less than 1%. CONCLUSION: The results suggest that fully coupled FSI simulation, which requires considerable computational power to run, adds little to rupture risk prediction. This justifies the use of SS models in previous studies.